Dataset statistics
| | Dataset A | Dataset B |
|---|---|---|
| Number of variables | 12 | 12 |
| Number of observations | 446 | 446 |
| Missing cells | 421 | 428 |
| Missing cells (%) | 7.9% | 8.0% |
| Duplicate rows | 0 | 0 |
| Duplicate rows (%) | 0.0% | 0.0% |
| Total size in memory | 45.3 KiB | 45.3 KiB |
| Average record size in memory | 104.0 B | 104.0 B |
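The derived rows in the table above can be reproduced from the raw counts. The following is a minimal sketch, assuming that "Missing cells (%)" is the number of missing cells divided by rows × variables and that "Average record size in memory" is the total size divided by the number of observations. The function names are illustrative and not part of ydata-profiling.

```python
# Values copied from the "Dataset statistics" table.
N_VARIABLES = 12
N_OBSERVATIONS = 446
MISSING_CELLS = {"Dataset A": 421, "Dataset B": 428}
TOTAL_SIZE_KIB = 45.3


def missing_cell_pct(missing: int, rows: int, cols: int) -> float:
    """Share of missing cells among all rows * cols cells, in percent."""
    return 100.0 * missing / (rows * cols)


def avg_record_bytes(total_kib: float, rows: int) -> float:
    """Average bytes per row, assuming 1 KiB = 1024 bytes."""
    return total_kib * 1024 / rows


for name, missing in MISSING_CELLS.items():
    pct = missing_cell_pct(missing, N_OBSERVATIONS, N_VARIABLES)
    print(f"{name}: {pct:.1f}% missing cells")

print(f"Average record size: {avg_record_bytes(TOTAL_SIZE_KIB, N_OBSERVATIONS):.1f} B")
```

Under these assumptions the results round to the 7.9%, 8.0% and 104.0 B shown in the table.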
Variable types
| | Dataset A | Dataset B |
|---|---|---|
| Numeric | 5 | 5 |
| Categorical | 4 | 4 |
| Text | 3 | 3 |
Alerts
| Dataset A | Dataset B | Alert type |
|---|---|---|
| Parch is highly overall correlated with SibSp | Alert not present in this dataset | High Correlation |
| Sex is highly overall correlated with Survived | Sex is highly overall correlated with Survived | High Correlation |
| SibSp is highly overall correlated with Parch | Alert not present in this dataset | High Correlation |
| Survived is highly overall correlated with Sex | Survived is highly overall correlated with Sex | High Correlation |
| Age has 76 (17.0%) missing values | Age has 89 (20.0%) missing values | Missing |
| Cabin has 345 (77.4%) missing values | Cabin has 338 (75.8%) missing values | Missing |
| PassengerId has unique values | PassengerId has unique values | Unique |
| Name has unique values | Name has unique values | Unique |
| SibSp has 298 (66.8%) zeros | SibSp has 304 (68.2%) zeros | Zeros |
| Parch has 332 (74.4%) zeros | Parch has 337 (75.6%) zeros | Zeros |
| Fare has 6 (1.3%) zeros | Fare has 8 (1.8%) zeros | Zeros |
Reproduction
| | Dataset A | Dataset B |
|---|---|---|
| Analysis started | 2024-09-07 10:04:45.831026 | 2024-09-07 10:04:49.001479 |
| Analysis finished | 2024-09-07 10:04:48.998110 | 2024-09-07 10:04:52.186942 |
| Duration | 3.17 seconds | 3.19 seconds |
| Software version | ydata-profiling v0.0.dev0 | ydata-profiling v0.0.dev0 |
| Download configuration | config.json | config.json |
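A comparison report of this kind could be regenerated along the following lines. This is a sketch under stated assumptions, not the exact procedure used for this report: it assumes an installed ydata-profiling release that provides `ProfileReport.compare`, and the helper name `build_comparison_report`, the titles and the output path are hypothetical.

```python
def build_comparison_report(df_a, df_b, output_path="comparison.html"):
    """Profile two pandas DataFrames and write a side-by-side comparison.

    Hypothetical helper: df_a and df_b are the two samples to compare.
    """
    # Imported lazily so this module loads even when the optional
    # dependency is not installed.
    from ydata_profiling import ProfileReport

    report_a = ProfileReport(df_a, title="Dataset A")
    report_b = ProfileReport(df_b, title="Dataset B")

    # compare() returns a combined report with one column per dataset.
    comparison = report_a.compare(report_b)
    comparison.to_file(output_path)
    return comparison
```

The helper would be called with two DataFrames that share the same columns, for example two samples of one source table.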
PassengerId
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 446 | 446 |
| Distinct (%) | 100.0% | 100.0% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 446.81166 | 440.6009 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 3 | 1 |
| Maximum | 891 | 890 |
| Zeros | 0 | 0 |
| Zeros (%) | 0.0% | 0.0% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 3 | 1 |
| 5-th percentile | 48 | 48.25 |
| Q1 | 216.25 | 227.25 |
| median | 440.5 | 420.5 |
| Q3 | 680.75 | 660.75 |
| 95-th percentile | 850.75 | 847.75 |
| Maximum | 891 | 890 |
| Range | 888 | 889 |
| Interquartile range (IQR) | 464.5 | 433.5 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 261.04917 | 257.07717 |
| Coefficient of variation (CV) | 0.58424879 | 0.58346947 |
| Kurtosis | -1.258845 | -1.176954 |
| Mean | 446.81166 | 440.6009 |
| Median Absolute Deviation (MAD) | 233 | 221 |
| Skewness | 0.041702545 | 0.074890775 |
| Sum | 199278 | 196508 |
| Variance | 68146.67 | 66088.672 |
| Monotonicity | Not monotonic | Not monotonic |
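Several of the PassengerId statistics above are tied together by simple identities, which gives a quick way to sanity-check the tables. The sketch below copies the reported values and tests them within a small relative tolerance; the dictionary layout and the function name are illustrative choices.

```python
import math

# Values copied from the PassengerId tables.
PASSENGER_ID = {
    "Dataset A": {
        "n": 446, "sum": 199278, "mean": 446.81166, "std": 261.04917,
        "variance": 68146.67, "cv": 0.58424879, "min": 3, "max": 891,
        "q1": 216.25, "q3": 680.75, "range": 888, "iqr": 464.5,
    },
    "Dataset B": {
        "n": 446, "sum": 196508, "mean": 440.6009, "std": 257.07717,
        "variance": 66088.672, "cv": 0.58346947, "min": 1, "max": 890,
        "q1": 227.25, "q3": 660.75, "range": 889, "iqr": 433.5,
    },
}


def consistency_checks(s: dict, rel_tol: float = 1e-6) -> dict:
    """Return one boolean per identity that the reported values should satisfy."""
    return {
        "mean == sum / n": math.isclose(s["mean"], s["sum"] / s["n"], rel_tol=rel_tol),
        "variance == std ** 2": math.isclose(s["variance"], s["std"] ** 2, rel_tol=rel_tol),
        "cv == std / mean": math.isclose(s["cv"], s["std"] / s["mean"], rel_tol=rel_tol),
        "range == max - min": s["range"] == s["max"] - s["min"],
        "iqr == q3 - q1": math.isclose(s["iqr"], s["q3"] - s["q1"], rel_tol=rel_tol),
    }


for name, stats in PASSENGER_ID.items():
    print(name, consistency_checks(stats))
```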
| Value | Count | Frequency (%) |
| 204 | 1 | 0.2% |
| 299 | 1 | 0.2% |
| 403 | 1 | 0.2% |
| 889 | 1 | 0.2% |
| 833 | 1 | 0.2% |
| 442 | 1 | 0.2% |
| 108 | 1 | 0.2% |
| 672 | 1 | 0.2% |
| 389 | 1 | 0.2% |
| 152 | 1 | 0.2% |
| Other values (436) | 436 | 97.8% |
| Value | Count | Frequency (%) |
| 843 | 1 | 0.2% |
| 794 | 1 | 0.2% |
| 170 | 1 | 0.2% |
| 325 | 1 | 0.2% |
| 764 | 1 | 0.2% |
| 141 | 1 | 0.2% |
| 444 | 1 | 0.2% |
| 332 | 1 | 0.2% |
| 857 | 1 | 0.2% |
| 568 | 1 | 0.2% |
| Other values (436) | 436 | 97.8% |
| Value | Count | Frequency (%) |
| 3 | 1 | 0.2% |
| 6 | 1 | 0.2% |
| 8 | 1 | 0.2% |
| 10 | 1 | 0.2% |
| 11 | 1 | 0.2% |
| 14 | 1 | 0.2% |
| 15 | 1 | 0.2% |
| 16 | 1 | 0.2% |
| 18 | 1 | 0.2% |
| 19 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 1 | 1 | 0.2% |
| 3 | 1 | 0.2% |
| 6 | 1 | 0.2% |
| 8 | 1 | 0.2% |
| 9 | 1 | 0.2% |
| 11 | 1 | 0.2% |
| 12 | 1 | 0.2% |
| 14 | 1 | 0.2% |
| 15 | 1 | 0.2% |
| 16 | 1 | 0.2% |
Survived
Categorical
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 2 | 2 |
| Distinct (%) | 0.4% | 0.4% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 1 | 1 |
| Median length | 1 | 1 |
| Mean length | 1 | 1 |
| Min length | 1 | 1 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 446 | 446 |
| Distinct characters | 2 | 2 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | 1 | 0 |
| 2nd row | 0 | 0 |
| 3rd row | 0 | 0 |
| 4th row | 0 | 1 |
| 5th row | 0 | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 283 | 63.5% |
| 1 | 163 | 36.5% |
| Value | Count | Frequency (%) |
| 0 | 272 | 61.0% |
| 1 | 174 | 39.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 283 | |
| 1 | 163 |
| Value | Count | Frequency (%) |
| 0 | 272 | |
| 1 | 174 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| 0 | 283 | |
| 1 | 163 |
| Value | Count | Frequency (%) |
| 0 | 272 | |
| 1 | 174 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| 0 | 283 | |
| 1 | 163 |
| Value | Count | Frequency (%) |
| 0 | 272 | |
| 1 | 174 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| 0 | 283 | |
| 1 | 163 |
| Value | Count | Frequency (%) |
| 0 | 272 | |
| 1 | 174 |
Pclass
Categorical
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 3 | 3 |
| Distinct (%) | 0.7% | 0.7% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 1 | 1 |
| Median length | 1 | 1 |
| Mean length | 1 | 1 |
| Min length | 1 | 1 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 446 | 446 |
| Distinct characters | 3 | 3 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | 1 | 1 |
| 2nd row | 3 | 3 |
| 3rd row | 3 | 3 |
| 4th row | 3 | 1 |
| 5th row | 3 | 3 |
Common Values
| Value | Count | Frequency (%) |
| 3 | 252 | 56.5% |
| 1 | 107 | 24.0% |
| 2 | 87 | 19.5% |
| Value | Count | Frequency (%) |
| 3 | 236 | 52.9% |
| 1 | 118 | 26.5% |
| 2 | 92 | 20.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| 3 | 252 | |
| 1 | 107 | |
| 2 | 87 | 19.5% |
| Value | Count | Frequency (%) |
| 3 | 236 | |
| 1 | 118 | |
| 2 | 92 | 20.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| 3 | 252 | |
| 1 | 107 | |
| 2 | 87 | 19.5% |
| Value | Count | Frequency (%) |
| 3 | 236 | |
| 1 | 118 | |
| 2 | 92 | 20.6% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| 3 | 252 | |
| 1 | 107 | |
| 2 | 87 | 19.5% |
| Value | Count | Frequency (%) |
| 3 | 236 | |
| 1 | 118 | |
| 2 | 92 | 20.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| 3 | 252 | |
| 1 | 107 | |
| 2 | 87 | 19.5% |
| Value | Count | Frequency (%) |
| 3 | 236 | |
| 1 | 118 | |
| 2 | 92 | 20.6% |
Name
Text
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 446 | 446 |
| Distinct (%) | 100.0% | 100.0% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 61 | 82 |
| Median length | 49 | 49 |
| Mean length | 26.681614 | 26.804933 |
| Min length | 13 | 13 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 11900 | 11955 |
| Distinct characters | 59 | 59 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 446 | 446 |
| Unique (%) | 100.0% | 100.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | Saalfeld, Mr. Adolphe | Hoyt, Mr. William Fisher |
| 2nd row | Jussila, Miss. Mari Aina | Ling, Mr. Lee |
| 3rd row | Johnston, Miss. Catherine Helen "Carrie" | Sage, Mr. George John Jr |
| 4th row | Saad, Mr. Amin | Carter, Mrs. William Ernest (Lucile Polk) |
| 5th row | Hampe, Mr. Leon | Boulos, Mrs. Joseph (Sultana) |
| Value | Count | Frequency (%) |
| mr | 270 | 15.0% |
| miss | 87 | 4.8% |
| mrs | 60 | 3.3% |
| william | 25 | 1.4% |
| john | 21 | 1.2% |
| master | 17 | 0.9% |
| charles | 17 | 0.9% |
| george | 13 | 0.7% |
| henry | 12 | 0.7% |
| james | 11 | 0.6% |
| Other values (908) | 1269 | 70.4% |
| Value | Count | Frequency (%) |
| mr | 253 | 14.0% |
| miss | 99 | 5.5% |
| mrs | 62 | 3.4% |
| william | 35 | 1.9% |
| john | 19 | 1.0% |
| master | 18 | 1.0% |
| henry | 17 | 0.9% |
| george | 16 | 0.9% |
| james | 13 | 0.7% |
| thomas | 12 | 0.7% |
| Other values (895) | 1266 | 69.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 1357 | 11.4% |
| r | 948 | 8.0% |
| a | 823 | 6.9% |
| e | 810 | 6.8% |
| n | 678 | 5.7% |
| i | 662 | 5.6% |
| s | 650 | 5.5% |
| M | 554 | 4.7% |
| l | 537 | 4.5% |
| o | 529 | 4.4% |
| Other values (49) | 4352 |
| Value | Count | Frequency (%) |
| (space) | 1365 | 11.4% |
| r | 974 | 8.1% |
| e | 884 | 7.4% |
| a | 843 | 7.1% |
| i | 677 | 5.7% |
| n | 661 | 5.5% |
| s | 657 | 5.5% |
| M | 559 | 4.7% |
| o | 490 | 4.1% |
| l | 484 | 4.0% |
| Other values (49) | 4361 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 11900 |
| Value | Count | Frequency (%) |
| (unknown) | 11955 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| (space) | 1357 | 11.4% |
| r | 948 | 8.0% |
| a | 823 | 6.9% |
| e | 810 | 6.8% |
| n | 678 | 5.7% |
| i | 662 | 5.6% |
| s | 650 | 5.5% |
| M | 554 | 4.7% |
| l | 537 | 4.5% |
| o | 529 | 4.4% |
| Other values (49) | 4352 |
| Value | Count | Frequency (%) |
| (space) | 1365 | 11.4% |
| r | 974 | 8.1% |
| e | 884 | 7.4% |
| a | 843 | 7.1% |
| i | 677 | 5.7% |
| n | 661 | 5.5% |
| s | 657 | 5.5% |
| M | 559 | 4.7% |
| o | 490 | 4.1% |
| l | 484 | 4.0% |
| Other values (49) | 4361 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 11900 |
| Value | Count | Frequency (%) |
| (unknown) | 11955 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| (space) | 1357 | 11.4% |
| r | 948 | 8.0% |
| a | 823 | 6.9% |
| e | 810 | 6.8% |
| n | 678 | 5.7% |
| i | 662 | 5.6% |
| s | 650 | 5.5% |
| M | 554 | 4.7% |
| l | 537 | 4.5% |
| o | 529 | 4.4% |
| Other values (49) | 4352 |
| Value | Count | Frequency (%) |
| (space) | 1365 | 11.4% |
| r | 974 | 8.1% |
| e | 884 | 7.4% |
| a | 843 | 7.1% |
| i | 677 | 5.7% |
| n | 661 | 5.5% |
| s | 657 | 5.5% |
| M | 559 | 4.7% |
| o | 490 | 4.1% |
| l | 484 | 4.0% |
| Other values (49) | 4361 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 11900 |
| Value | Count | Frequency (%) |
| (unknown) | 11955 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| (space) | 1357 | 11.4% |
| r | 948 | 8.0% |
| a | 823 | 6.9% |
| e | 810 | 6.8% |
| n | 678 | 5.7% |
| i | 662 | 5.6% |
| s | 650 | 5.5% |
| M | 554 | 4.7% |
| l | 537 | 4.5% |
| o | 529 | 4.4% |
| Other values (49) | 4352 |
| Value | Count | Frequency (%) |
| (space) | 1365 | 11.4% |
| r | 974 | 8.1% |
| e | 884 | 7.4% |
| a | 843 | 7.1% |
| i | 677 | 5.7% |
| n | 661 | 5.5% |
| s | 657 | 5.5% |
| M | 559 | 4.7% |
| o | 490 | 4.1% |
| l | 484 | 4.0% |
| Other values (49) | 4361 |
Sex
Categorical
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 2 | 2 |
| Distinct (%) | 0.4% | 0.4% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 6 | 6 |
| Median length | 4 | 4 |
| Mean length | 4.6726457 | 4.735426 |
| Min length | 4 | 4 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 2084 | 2112 |
| Distinct characters | 5 | 5 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | male | male |
| 2nd row | female | male |
| 3rd row | female | male |
| 4th row | male | female |
| 5th row | male | female |
Common Values
| Value | Count | Frequency (%) |
| male | 296 | 66.4% |
| female | 150 | 33.6% |
| Value | Count | Frequency (%) |
| male | 282 | 63.2% |
| female | 164 | 36.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 596 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 150 | 7.2% |
| Value | Count | Frequency (%) |
| e | 610 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 164 | 7.8% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 2084 |
| Value | Count | Frequency (%) |
| (unknown) | 2112 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| e | 596 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 150 | 7.2% |
| Value | Count | Frequency (%) |
| e | 610 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 164 | 7.8% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 2084 |
| Value | Count | Frequency (%) |
| (unknown) | 2112 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| e | 596 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 150 | 7.2% |
| Value | Count | Frequency (%) |
| e | 610 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 164 | 7.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 2084 |
| Value | Count | Frequency (%) |
| (unknown) | 2112 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| e | 596 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 150 | 7.2% |
| Value | Count | Frequency (%) |
| e | 610 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 164 | 7.8% |
Age
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 74 | 69 |
| Distinct (%) | 20.0% | 19.3% |
| Missing | 76 | 89 |
| Missing (%) | 17.0% | 20.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 29.623649 | 29.714734 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0.75 | 0.75 |
| Maximum | 74 | 70 |
| Zeros | 0 | 0 |
| Zeros (%) | 0.0% | 0.0% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0.75 | 0.75 |
| 5-th percentile | 6 | 3 |
| Q1 | 20 | 21 |
| median | 28.5 | 28 |
| Q3 | 37.75 | 38 |
| 95-th percentile | 56.55 | 56.4 |
| Maximum | 74 | 70 |
| Range | 73.25 | 69.25 |
| Interquartile range (IQR) | 17.75 | 17 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 14.15058 | 14.511798 |
| Coefficient of variation (CV) | 0.47767849 | 0.48837045 |
| Kurtosis | 0.290494 | -0.11158716 |
| Mean | 29.623649 | 29.714734 |
| Median Absolute Deviation (MAD) | 8.5 | 8 |
| Skewness | 0.47303077 | 0.24958798 |
| Sum | 10960.75 | 10608.16 |
| Variance | 200.2389 | 210.59228 |
| Monotonicity | Not monotonic | Not monotonic |
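The Age figures mix two denominators, which is easy to miss when reading the tables. The reported values are consistent with "Missing (%)" being computed over all 446 rows, while "Distinct (%)" and "Mean" are computed over the non-missing values only. The sketch below checks that reading against the reported numbers; it is an inference from the figures, not a statement about ydata-profiling internals, and the names are illustrative.

```python
N_ROWS = 446

# Values copied from the Age tables.
AGE = {
    "Dataset A": {"missing": 76, "distinct": 74, "sum": 10960.75, "mean": 29.623649},
    "Dataset B": {"missing": 89, "distinct": 69, "sum": 10608.16, "mean": 29.714734},
}


def age_summary(s: dict, n_rows: int = N_ROWS) -> dict:
    """Recompute the derived Age figures from the raw counts and the sum."""
    valid = n_rows - s["missing"]  # rows with a recorded age
    return {
        "valid": valid,
        "missing_pct": round(100 * s["missing"] / n_rows, 1),  # over all rows
        "distinct_pct": round(100 * s["distinct"] / valid, 1),  # over valid rows
        "mean": s["sum"] / valid,                               # over valid rows
    }


for name, stats in AGE.items():
    print(name, age_summary(stats))
```

With 370 and 357 valid rows respectively, the recomputed percentages and means agree with the tables to the precision shown.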
| Value | Count | Frequency (%) |
| 22 | 16 | 3.6% |
| 18 | 16 | 3.6% |
| 19 | 16 | 3.6% |
| 25 | 14 | 3.1% |
| 21 | 14 | 3.1% |
| 24 | 13 | 2.9% |
| 30 | 13 | 2.9% |
| 29 | 13 | 2.9% |
| 32 | 12 | 2.7% |
| 33 | 12 | 2.7% |
| Other values (64) | 231 | 51.8% |
| (Missing) | 76 | 17.0% |
| Value | Count | Frequency (%) |
| 22 | 19 | 4.3% |
| 28 | 16 | 3.6% |
| 25 | 14 | 3.1% |
| 36 | 13 | 2.9% |
| 21 | 12 | 2.7% |
| 18 | 12 | 2.7% |
| 27 | 11 | 2.5% |
| 29 | 10 | 2.2% |
| 32 | 10 | 2.2% |
| 34 | 10 | 2.2% |
| Other values (59) | 230 | 51.6% |
| (Missing) | 89 | 20.0% |
| Value | Count | Frequency (%) |
| 0.75 | 1 | 0.2% |
| 1 | 3 | 0.7% |
| 2 | 5 | 1.1% |
| 3 | 4 | 0.9% |
| 4 | 5 | 1.1% |
| 6 | 3 | 0.7% |
| 7 | 2 | 0.4% |
| 8 | 2 | 0.4% |
| 9 | 3 | 0.7% |
| 10 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0.75 | 2 | 0.4% |
| 0.83 | 2 | 0.4% |
| 1 | 5 | 1.1% |
| 2 | 6 | 1.3% |
| 3 | 4 | 0.9% |
| 4 | 5 | 1.1% |
| 5 | 1 | 0.2% |
| 6 | 2 | 0.4% |
| 8 | 2 | 0.4% |
| 9 | 3 | 0.7% |
SibSp
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 7 | 7 |
| Distinct (%) | 1.6% | 1.6% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 0.5470852 | 0.54484305 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| Maximum | 8 | 8 |
| Zeros | 298 | 304 |
| Zeros (%) | 66.8% | 68.2% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| 5-th percentile | 0 | 0 |
| Q1 | 0 | 0 |
| median | 0 | 0 |
| Q3 | 1 | 1 |
| 95-th percentile | 3 | 3 |
| Maximum | 8 | 8 |
| Range | 8 | 8 |
| Interquartile range (IQR) | 1 | 1 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 1.0918608 | 1.1866326 |
| Coefficient of variation (CV) | 1.9957784 | 2.1779347 |
| Kurtosis | 15.79564 | 17.948964 |
| Mean | 0.5470852 | 0.54484305 |
| Median Absolute Deviation (MAD) | 0 | 0 |
| Skewness | 3.406358 | 3.7998006 |
| Sum | 244 | 243 |
| Variance | 1.19216 | 1.4080969 |
| Monotonicity | Not monotonic | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 298 | 66.8% |
| 1 | 107 | 24.0% |
| 2 | 14 | 3.1% |
| 3 | 12 | 2.7% |
| 4 | 11 | 2.5% |
| 8 | 3 | 0.7% |
| 5 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 304 | 68.2% |
| 1 | 105 | 23.5% |
| 2 | 13 | 2.9% |
| 4 | 9 | 2.0% |
| 3 | 7 | 1.6% |
| 8 | 5 | 1.1% |
| 5 | 3 | 0.7% |
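Because SibSp takes only seven distinct values, its descriptive statistics can be recomputed directly from the two value-count tables above. The sketch below does so and suggests that the reported variance is the sample variance (divisor n - 1). That is an inference from the numbers rather than a documented guarantee, and the function name is illustrative.

```python
import math

# Value -> count, copied from the SibSp value tables.
SIBSP_COUNTS = {
    "Dataset A": {0: 298, 1: 107, 2: 14, 3: 12, 4: 11, 8: 3, 5: 1},
    "Dataset B": {0: 304, 1: 105, 2: 13, 4: 9, 3: 7, 8: 5, 5: 3},
}


def moments_from_counts(counts: dict) -> dict:
    """Sum, mean, sample variance and standard deviation from value counts."""
    n = sum(counts.values())
    total = sum(value * count for value, count in counts.items())
    mean = total / n
    # Sample variance: squared deviations weighted by count, divided by n - 1.
    variance = sum(count * (value - mean) ** 2 for value, count in counts.items()) / (n - 1)
    return {"n": n, "sum": total, "mean": mean,
            "variance": variance, "std": math.sqrt(variance)}


for name, counts in SIBSP_COUNTS.items():
    print(name, moments_from_counts(counts))
```

The recomputed sums (244 and 243), means, variances and standard deviations agree with the "Descriptive statistics" table for SibSp.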
| Value | Count | Frequency (%) |
| 0 | 298 | 66.8% |
| 1 | 107 | 24.0% |
| 2 | 14 | 3.1% |
| 3 | 12 | 2.7% |
| 4 | 11 | 2.5% |
| 5 | 1 | 0.2% |
| 8 | 3 | 0.7% |
| Value | Count | Frequency (%) |
| 0 | 304 | 68.2% |
| 1 | 105 | 23.5% |
| 2 | 13 | 2.9% |
| 3 | 7 | 1.6% |
| 4 | 9 | 2.0% |
| 5 | 3 | 0.7% |
| 8 | 5 | 1.1% |
Parch
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 6 | 6 |
| Distinct (%) | 1.3% | 1.3% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 0.41255605 | 0.38340807 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| Maximum | 5 | 5 |
| Zeros | 332 | 337 |
| Zeros (%) | 74.4% | 75.6% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| 5-th percentile | 0 | 0 |
| Q1 | 0 | 0 |
| median | 0 | 0 |
| Q3 | 1 | 0 |
| 95-th percentile | 2 | 2 |
| Maximum | 5 | 5 |
| Range | 5 | 5 |
| Interquartile range (IQR) | 1 | 0 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 0.85339992 | 0.78674014 |
| Coefficient of variation (CV) | 2.0685672 | 2.0519655 |
| Kurtosis | 8.9281957 | 8.4967198 |
| Mean | 0.41255605 | 0.38340807 |
| Median Absolute Deviation (MAD) | 0 | 0 |
| Skewness | 2.7150907 | 2.5729338 |
| Sum | 184 | 171 |
| Variance | 0.72829143 | 0.61896004 |
| Monotonicity | Not monotonic | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 332 | 74.4% |
| 1 | 66 | 14.8% |
| 2 | 38 | 8.5% |
| 5 | 4 | 0.9% |
| 4 | 4 | 0.9% |
| 3 | 2 | 0.4% |
| Value | Count | Frequency (%) |
| 0 | 337 | 75.6% |
| 1 | 60 | 13.5% |
| 2 | 43 | 9.6% |
| 5 | 3 | 0.7% |
| 3 | 2 | 0.4% |
| 4 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 332 | 74.4% |
| 1 | 66 | 14.8% |
| 2 | 38 | 8.5% |
| 3 | 2 | 0.4% |
| 4 | 4 | 0.9% |
| 5 | 4 | 0.9% |
| Value | Count | Frequency (%) |
| 0 | 337 | 75.6% |
| 1 | 60 | 13.5% |
| 2 | 43 | 9.6% |
| 3 | 2 | 0.4% |
| 4 | 1 | 0.2% |
| 5 | 3 | 0.7% |
Ticket
Text
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 375 | 373 |
| Distinct (%) | 84.1% | 83.6% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 18 | 18 |
| Median length | 17 | 17 |
| Mean length | 6.8587444 | 6.6950673 |
| Min length | 3 | 4 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 3059 | 2986 |
| Distinct characters | 31 | 35 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 324 | 323 |
| Unique (%) | 72.6% | 72.4% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | 19988 | PC 17600 |
| 2nd row | 4137 | 1601 |
| 3rd row | W./C. 6607 | CA. 2343 |
| 4th row | 2671 | 113760 |
| 5th row | 345769 | 2678 |
| Value | Count | Frequency (%) |
| pc | 31 | 5.5% |
| c.a | 13 | 2.3% |
| a/5 | 8 | 1.4% |
| ston/o | 8 | 1.4% |
| 2 | 8 | 1.4% |
| sc/paris | 6 | 1.1% |
| 347082 | 6 | 1.1% |
| soton/oq | 5 | 0.9% |
| 3101295 | 5 | 0.9% |
| 347088 | 5 | 0.9% |
| Other values (392) | 473 | 83.3% |
| Value | Count | Frequency (%) |
| pc | 35 | 6.2% |
| c.a | 9 | 1.6% |
| ca | 8 | 1.4% |
| 347082 | 5 | 0.9% |
| 2343 | 5 | 0.9% |
| ston/o2 | 5 | 0.9% |
| ston/o | 5 | 0.9% |
| 2 | 5 | 0.9% |
| sc/paris | 4 | 0.7% |
| a/5 | 4 | 0.7% |
| Other values (393) | 477 | 84.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| 3 | 373 | |
| 1 | 347 | |
| 2 | 299 | |
| 7 | 265 | |
| 4 | 224 | 7.3% |
| 0 | 213 | 7.0% |
| 6 | 206 | 6.7% |
| 5 | 178 | 5.8% |
| 9 | 176 | 5.8% |
| 8 | 150 | 4.9% |
| Other values (21) | 628 |
| Value | Count | Frequency (%) |
| 3 | 376 | |
| 1 | 364 | |
| 2 | 301 | |
| 7 | 252 | |
| 4 | 231 | |
| 6 | 207 | 6.9% |
| 0 | 199 | 6.7% |
| 5 | 179 | 6.0% |
| 9 | 162 | 5.4% |
| 8 | 138 | 4.6% |
| Other values (25) | 577 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 3059 |
| Value | Count | Frequency (%) |
| (unknown) | 2986 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| 3 | 373 | |
| 1 | 347 | |
| 2 | 299 | |
| 7 | 265 | |
| 4 | 224 | 7.3% |
| 0 | 213 | 7.0% |
| 6 | 206 | 6.7% |
| 5 | 178 | 5.8% |
| 9 | 176 | 5.8% |
| 8 | 150 | 4.9% |
| Other values (21) | 628 |
| Value | Count | Frequency (%) |
| 3 | 376 | |
| 1 | 364 | |
| 2 | 301 | |
| 7 | 252 | |
| 4 | 231 | |
| 6 | 207 | 6.9% |
| 0 | 199 | 6.7% |
| 5 | 179 | 6.0% |
| 9 | 162 | 5.4% |
| 8 | 138 | 4.6% |
| Other values (25) | 577 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 3059 |
| Value | Count | Frequency (%) |
| (unknown) | 2986 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| 3 | 373 | |
| 1 | 347 | |
| 2 | 299 | |
| 7 | 265 | |
| 4 | 224 | 7.3% |
| 0 | 213 | 7.0% |
| 6 | 206 | 6.7% |
| 5 | 178 | 5.8% |
| 9 | 176 | 5.8% |
| 8 | 150 | 4.9% |
| Other values (21) | 628 |
| Value | Count | Frequency (%) |
| 3 | 376 | |
| 1 | 364 | |
| 2 | 301 | |
| 7 | 252 | |
| 4 | 231 | |
| 6 | 207 | 6.9% |
| 0 | 199 | 6.7% |
| 5 | 179 | 6.0% |
| 9 | 162 | 5.4% |
| 8 | 138 | 4.6% |
| Other values (25) | 577 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 3059 |
| Value | Count | Frequency (%) |
| (unknown) | 2986 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| 3 | 373 | |
| 1 | 347 | |
| 2 | 299 | |
| 7 | 265 | |
| 4 | 224 | 7.3% |
| 0 | 213 | 7.0% |
| 6 | 206 | 6.7% |
| 5 | 178 | 5.8% |
| 9 | 176 | 5.8% |
| 8 | 150 | 4.9% |
| Other values (21) | 628 |
| Value | Count | Frequency (%) |
| 3 | 376 | |
| 1 | 364 | |
| 2 | 301 | |
| 7 | 252 | |
| 4 | 231 | |
| 6 | 207 | 6.9% |
| 0 | 199 | 6.7% |
| 5 | 179 | 6.0% |
| 9 | 162 | 5.4% |
| 8 | 138 | 4.6% |
| Other values (25) | 577 |
Fare
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 167 | 182 |
| Distinct (%) | 37.4% | 40.8% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 33.495898 | 34.731211 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| Maximum | 512.3292 | 512.3292 |
| Zeros | 6 | 8 |
| Zeros (%) | 1.3% | 1.8% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| 5-th percentile | 7.225 | 7.225 |
| Q1 | 7.925 | 7.925 |
| median | 14.5 | 15.5 |
| Q3 | 30.92395 | 32.596875 |
| 95-th percentile | 118.31875 | 133.65 |
| Maximum | 512.3292 | 512.3292 |
| Range | 512.3292 | 512.3292 |
| Interquartile range (IQR) | 22.99895 | 24.671875 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 54.314729 | 52.845168 |
| Coefficient of variation (CV) | 1.6215337 | 1.521547 |
| Kurtosis | 30.627742 | 31.953129 |
| Mean | 33.495898 | 34.731211 |
| Median Absolute Deviation (MAD) | 7.25 | 8.21875 |
| Skewness | 4.7212937 | 4.6474074 |
| Sum | 14939.171 | 15490.12 |
| Variance | 2950.0898 | 2792.6118 |
| Monotonicity | Not monotonic | Not monotonic |
| Value | Count | Frequency (%) |
| 8.05 | 28 | 6.3% |
| 13 | 18 | 4.0% |
| 7.8958 | 17 | 3.8% |
| 7.75 | 16 | 3.6% |
| 7.775 | 13 | 2.9% |
| 26 | 13 | 2.9% |
| 7.2292 | 10 | 2.2% |
| 7.925 | 10 | 2.2% |
| 10.5 | 10 | 2.2% |
| 26.55 | 9 | 2.0% |
| Other values (157) | 302 | 67.7% |
| Value | Count | Frequency (%) |
| 13 | 23 | 5.2% |
| 8.05 | 23 | 5.2% |
| 7.8958 | 21 | 4.7% |
| 26 | 18 | 4.0% |
| 7.75 | 12 | 2.7% |
| 7.925 | 10 | 2.2% |
| 7.2292 | 9 | 2.0% |
| 7.775 | 9 | 2.0% |
| 26.55 | 8 | 1.8% |
| 0 | 8 | 1.8% |
| Other values (172) | 305 | 68.4% |
| Value | Count | Frequency (%) |
| 0 | 6 | 1.3% |
| 4.0125 | 1 | 0.2% |
| 5 | 1 | 0.2% |
| 6.4958 | 1 | 0.2% |
| 6.95 | 1 | 0.2% |
| 6.975 | 1 | 0.2% |
| 7.05 | 4 | 0.9% |
| 7.0542 | 1 | 0.2% |
| 7.125 | 3 | 0.7% |
| 7.1417 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 8 | 1.8% |
| 4.0125 | 1 | 0.2% |
| 6.2375 | 1 | 0.2% |
| 6.4375 | 1 | 0.2% |
| 6.45 | 1 | 0.2% |
| 6.4958 | 1 | 0.2% |
| 6.8583 | 1 | 0.2% |
| 6.95 | 1 | 0.2% |
| 7.0458 | 1 | 0.2% |
| 7.05 | 2 | 0.4% |
Cabin
Text
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 85 | 92 |
| Distinct (%) | 84.2% | 85.2% |
| Missing | 345 | 338 |
| Missing (%) | 77.4% | 75.8% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 15 | 15 |
| Median length | 3 | 3 |
| Mean length | 3.9306931 | 3.5833333 |
| Min length | 1 | 1 |
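For a sparsely populated text column such as Cabin, the per-value figures only make sense relative to the rows that actually hold a value. The sketch below, using the counts reported for Cabin in this report, checks that "Mean length" equals total characters divided by non-missing rows and that "Distinct (%)" uses the same denominator. The names are illustrative.

```python
N_ROWS = 446

# Values copied from the Cabin tables of this report.
CABIN = {
    "Dataset A": {"missing": 345, "distinct": 85, "total_chars": 397},
    "Dataset B": {"missing": 338, "distinct": 92, "total_chars": 387},
}


def cabin_summary(s: dict, n_rows: int = N_ROWS) -> dict:
    """Recompute mean length and distinct share over non-missing rows."""
    present = n_rows - s["missing"]  # rows with a recorded cabin
    return {
        "present": present,
        "mean_length": s["total_chars"] / present,
        "distinct_pct": round(100 * s["distinct"] / present, 1),
    }


for name, stats in CABIN.items():
    print(name, cabin_summary(stats))
```

With 101 and 108 populated rows, the recomputed mean lengths and distinct percentages agree with the reported 3.9306931 and 3.5833333, and 84.2% and 85.2%.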
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 397 | 387 |
| Distinct characters | 19 | 19 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 70 | 78 |
| Unique (%) | 69.3% | 72.2% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | C106 | B96 B98 |
| 2nd row | B71 | C124 |
| 3rd row | C2 | C32 |
| 4th row | B37 | B4 |
| 5th row | E8 | B39 |
| Value | Count | Frequency (%) |
| f | 4 | 3.1% |
| c23 | 3 | 2.3% |
| c25 | 3 | 2.3% |
| c27 | 3 | 2.3% |
| b57 | 2 | 1.6% |
| c52 | 2 | 1.6% |
| b63 | 2 | 1.6% |
| b66 | 2 | 1.6% |
| c126 | 2 | 1.6% |
| b59 | 2 | 1.6% |
| Other values (88) | 103 | 80.5% |
| Value | Count | Frequency (%) |
| b96 | 3 | 2.4% |
| b98 | 3 | 2.4% |
| f2 | 3 | 2.4% |
| f | 3 | 2.4% |
| d17 | 2 | 1.6% |
| c124 | 2 | 1.6% |
| d33 | 2 | 1.6% |
| g6 | 2 | 1.6% |
| b22 | 2 | 1.6% |
| g73 | 2 | 1.6% |
| Other values (95) | 102 | 81.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| C | 39 | |
| B | 39 | |
| 2 | 36 | 9.1% |
| 1 | 34 | 8.6% |
| 5 | 30 | 7.6% |
| 6 | 29 | 7.3% |
| 3 | 28 | 7.1% |
| (space) | 27 | 6.8% |
| 9 | 20 | 5.0% |
| 7 | 18 | 4.5% |
| Other values (9) | 97 |
| Value | Count | Frequency (%) |
| 2 | 39 | 10.1% |
| C | 39 | 10.1% |
| 1 | 33 | 8.5% |
| B | 32 | 8.3% |
| 6 | 29 | 7.5% |
| 3 | 28 | 7.2% |
| 5 | 22 | 5.7% |
| D | 22 | 5.7% |
| 8 | 21 | 5.4% |
| 4 | 19 | 4.9% |
| Other values (9) | 103 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 397 |
| Value | Count | Frequency (%) |
| (unknown) | 387 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| C | 39 | |
| B | 39 | |
| 2 | 36 | 9.1% |
| 1 | 34 | 8.6% |
| 5 | 30 | 7.6% |
| 6 | 29 | 7.3% |
| 3 | 28 | 7.1% |
| (space) | 27 | 6.8% |
| 9 | 20 | 5.0% |
| 7 | 18 | 4.5% |
| Other values (9) | 97 |
| Value | Count | Frequency (%) |
| 2 | 39 | 10.1% |
| C | 39 | 10.1% |
| 1 | 33 | 8.5% |
| B | 32 | 8.3% |
| 6 | 29 | 7.5% |
| 3 | 28 | 7.2% |
| 5 | 22 | 5.7% |
| D | 22 | 5.7% |
| 8 | 21 | 5.4% |
| 4 | 19 | 4.9% |
| Other values (9) | 103 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 397 |
| Value | Count | Frequency (%) |
| (unknown) | 387 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| C | 39 | |
| B | 39 | |
| 2 | 36 | 9.1% |
| 1 | 34 | 8.6% |
| 5 | 30 | 7.6% |
| 6 | 29 | 7.3% |
| 3 | 28 | 7.1% |
| (space) | 27 | 6.8% |
| 9 | 20 | 5.0% |
| 7 | 18 | 4.5% |
| Other values (9) | 97 |
| Value | Count | Frequency (%) |
| 2 | 39 | 10.1% |
| C | 39 | 10.1% |
| 1 | 33 | 8.5% |
| B | 32 | 8.3% |
| 6 | 29 | 7.5% |
| 3 | 28 | 7.2% |
| 5 | 22 | 5.7% |
| D | 22 | 5.7% |
| 8 | 21 | 5.4% |
| 4 | 19 | 4.9% |
| Other values (9) | 103 | 26.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 397 | 100.0% |
| Value | Count | Frequency (%) |
| (unknown) | 387 | 100.0% |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| C | 39 | 9.8% |
| B | 39 | 9.8% |
| 2 | 36 | 9.1% |
| 1 | 34 | 8.6% |
| 5 | 30 | 7.6% |
| 6 | 29 | 7.3% |
| 3 | 28 | 7.1% |
|   | 27 | 6.8% |
| 9 | 20 | 5.0% |
| 7 | 18 | 4.5% |
| Other values (9) | 97 | 24.4% |
| Value | Count | Frequency (%) |
| 2 | 39 | 10.1% |
| C | 39 | 10.1% |
| 1 | 33 | 8.5% |
| B | 32 | 8.3% |
| 6 | 29 | 7.5% |
| 3 | 28 | 7.2% |
| 5 | 22 | 5.7% |
| D | 22 | 5.7% |
| 8 | 21 | 5.4% |
| 4 | 19 | 4.9% |
| Other values (9) | 103 | 26.6% |
Embarked
Categorical
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 3 | 3 |
| Distinct (%) | 0.7% | 0.7% |
| Missing | 0 | 1 |
| Missing (%) | 0.0% | 0.2% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 1 | 1 |
| Median length | 1 | 1 |
| Mean length | 1 | 1 |
| Min length | 1 | 1 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 446 | 445 |
| Distinct characters | 3 | 3 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | S | C |
| 2nd row | S | S |
| 3rd row | S | S |
| 4th row | C | S |
| 5th row | S | C |
Common Values
| Value | Count | Frequency (%) |
| S | 322 | 72.2% |
| C | 89 | 20.0% |
| Q | 35 | 7.8% |
| Value | Count | Frequency (%) |
| S | 309 | 69.3% |
| C | 99 | 22.2% |
| Q | 37 | 8.3% |
| (Missing) | 1 | 0.2% |
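The Frequency (%) column appears to be each count divided by the number of observations (446 per the dataset statistics), rounded to one decimal place. The sketch below checks that arithmetic against the Dataset B counts; `frequency_table` is a hypothetical helper written for this illustration and is not part of ydata-profiling.

```python
def frequency_table(counts, total):
    """Return (value, count, percent) rows, with percent rounded to one decimal."""
    return [(value, n, f"{100 * n / total:.1f}%") for value, n in counts.items()]


# Counts copied from the Dataset B table above.
# Assumption: the denominator is the 446 observations, missing row included.
dataset_b = {"S": 309, "C": 99, "Q": 37, "(Missing)": 1}

for value, count, percent in frequency_table(dataset_b, total=446):
    print(f"{value:<10} {count:>4} {percent:>6}")
```

Under that assumption the C, Q and (Missing) rows reproduce the 22.2%, 8.3% and 0.2% shown in the table, and the S row works out to 69.3%.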
[Plots: Length; Common Values (Plot), shown for Dataset A and Dataset B]
| Value | Count | Frequency (%) |
| s | 322 | 72.2% |
| c | 89 | 20.0% |
| q | 35 | 7.8% |
| Value | Count | Frequency (%) |
| s | 309 | 69.4% |
| c | 99 | 22.2% |
| q | 37 | 8.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| S | 322 | 72.2% |
| C | 89 | 20.0% |
| Q | 35 | 7.8% |
| Value | Count | Frequency (%) |
| S | 309 | 69.4% |
| C | 99 | 22.2% |
| Q | 37 | 8.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 446 | 100.0% |
| Value | Count | Frequency (%) |
| (unknown) | 445 | 100.0% |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| S | 322 | 72.2% |
| C | 89 | 20.0% |
| Q | 35 | 7.8% |
| Value | Count | Frequency (%) |
| S | 309 | 69.4% |
| C | 99 | 22.2% |
| Q | 37 | 8.3% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 446 | 100.0% |
| Value | Count | Frequency (%) |
| (unknown) | 445 | 100.0% |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| S | 322 | 72.2% |
| C | 89 | 20.0% |
| Q | 35 | 7.8% |
| Value | Count | Frequency (%) |
| S | 309 | 69.4% |
| C | 99 | 22.2% |
| Q | 37 | 8.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 446 | 100.0% |
| Value | Count | Frequency (%) |
| (unknown) | 445 | 100.0% |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| S | 322 | 72.2% |
| C | 89 | 20.0% |
| Q | 35 | 7.8% |
| Value | Count | Frequency (%) |
| S | 309 | 69.4% |
| C | 99 | 22.2% |
| Q | 37 | 8.3% |
Dataset A
| | Age | Embarked | Fare | Parch | PassengerId | Pclass | Sex | SibSp | Survived |
|---|---|---|---|---|---|---|---|---|---|
| Age | 1.000 | 0.115 | 0.117 | -0.210 | 0.086 | 0.271 | 0.129 | -0.235 | 0.122 |
| Embarked | 0.115 | 1.000 | 0.199 | 0.000 | 0.000 | 0.238 | 0.000 | 0.084 | 0.129 |
| Fare | 0.117 | 0.199 | 1.000 | 0.442 | -0.024 | 0.467 | 0.210 | 0.473 | 0.275 |
| Parch | -0.210 | 0.000 | 0.442 | 1.000 | 0.038 | 0.076 | 0.305 | 0.510 | 0.165 |
| PassengerId | 0.086 | 0.000 | -0.024 | 0.038 | 1.000 | 0.062 | 0.063 | -0.045 | 0.000 |
| Pclass | 0.271 | 0.238 | 0.467 | 0.076 | 0.062 | 1.000 | 0.158 | 0.146 | 0.333 |
| Sex | 0.129 | 0.000 | 0.210 | 0.305 | 0.063 | 0.158 | 1.000 | 0.274 | 0.537 |
| SibSp | -0.235 | 0.084 | 0.473 | 0.510 | -0.045 | 0.146 | 0.274 | 1.000 | 0.211 |
| Survived | 0.122 | 0.129 | 0.275 | 0.165 | 0.000 | 0.333 | 0.537 | 0.211 | 1.000 |
Dataset B
| | Age | Embarked | Fare | Parch | PassengerId | Pclass | Sex | SibSp | Survived |
|---|---|---|---|---|---|---|---|---|---|
| Age | 1.000 | 0.006 | 0.136 | -0.247 | 0.005 | 0.260 | 0.000 | -0.202 | 0.158 |
| Embarked | 0.006 | 1.000 | 0.211 | 0.025 | 0.000 | 0.291 | 0.139 | 0.079 | 0.215 |
| Fare | 0.136 | 0.211 | 1.000 | 0.410 | -0.008 | 0.456 | 0.238 | 0.415 | 0.303 |
| Parch | -0.247 | 0.025 | 0.410 | 1.000 | -0.015 | 0.000 | 0.198 | 0.436 | 0.089 |
| PassengerId | 0.005 | 0.000 | -0.008 | -0.015 | 1.000 | 0.000 | 0.000 | -0.085 | 0.082 |
| Pclass | 0.260 | 0.291 | 0.456 | 0.000 | 0.000 | 1.000 | 0.141 | 0.153 | 0.373 |
| Sex | 0.000 | 0.139 | 0.238 | 0.198 | 0.000 | 0.141 | 1.000 | 0.186 | 0.576 |
| SibSp | -0.202 | 0.079 | 0.415 | 0.436 | -0.085 | 0.153 | 0.186 | 1.000 | 0.207 |
| Survived | 0.158 | 0.215 | 0.303 | 0.089 | 0.082 | 0.373 | 0.576 | 0.207 | 1.000 |
Dataset A
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 298 | 299 | 1 | 1 | Saalfeld, Mr. Adolphe | male | NaN | 0 | 0 | 19988 | 30.5000 | C106 | S |
| 402 | 403 | 0 | 3 | Jussila, Miss. Mari Aina | female | 21.0 | 1 | 0 | 4137 | 9.8250 | NaN | S |
| 888 | 889 | 0 | 3 | Johnston, Miss. Catherine Helen "Carrie" | female | NaN | 1 | 2 | W./C. 6607 | 23.4500 | NaN | S |
| 832 | 833 | 0 | 3 | Saad, Mr. Amin | male | NaN | 0 | 0 | 2671 | 7.2292 | NaN | C |
| 441 | 442 | 0 | 3 | Hampe, Mr. Leon | male | 20.0 | 0 | 0 | 345769 | 9.5000 | NaN | S |
| 107 | 108 | 1 | 3 | Moss, Mr. Albert Johan | male | NaN | 0 | 0 | 312991 | 7.7750 | NaN | S |
| 671 | 672 | 0 | 1 | Davidson, Mr. Thornton | male | 31.0 | 1 | 0 | F.C. 12750 | 52.0000 | B71 | S |
| 388 | 389 | 0 | 3 | Sadlier, Mr. Matthew | male | NaN | 0 | 0 | 367655 | 7.7292 | NaN | Q |
| 151 | 152 | 1 | 1 | Pears, Mrs. Thomas (Edith Wearne) | female | 22.0 | 1 | 0 | 113776 | 66.6000 | C2 | S |
| 428 | 429 | 0 | 3 | Flynn, Mr. James | male | NaN | 0 | 0 | 364851 | 7.7500 | NaN | Q |
Dataset B
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 793 | 794 | 0 | 1 | Hoyt, Mr. William Fisher | male | NaN | 0 | 0 | PC 17600 | 30.6958 | NaN | C |
| 169 | 170 | 0 | 3 | Ling, Mr. Lee | male | 28.00 | 0 | 0 | 1601 | 56.4958 | NaN | S |
| 324 | 325 | 0 | 3 | Sage, Mr. George John Jr | male | NaN | 8 | 2 | CA. 2343 | 69.5500 | NaN | S |
| 763 | 764 | 1 | 1 | Carter, Mrs. William Ernest (Lucile Polk) | female | 36.00 | 1 | 2 | 113760 | 120.0000 | B96 B98 | S |
| 140 | 141 | 0 | 3 | Boulos, Mrs. Joseph (Sultana) | female | NaN | 0 | 2 | 2678 | 15.2458 | NaN | C |
| 443 | 444 | 1 | 2 | Reynaldo, Ms. Encarnacion | female | 28.00 | 0 | 0 | 230434 | 13.0000 | NaN | S |
| 331 | 332 | 0 | 1 | Partner, Mr. Austen | male | 45.50 | 0 | 0 | 113043 | 28.5000 | C124 | S |
| 856 | 857 | 1 | 1 | Wick, Mrs. George Dennick (Mary Hitchcock) | female | 45.00 | 1 | 1 | 36928 | 164.8667 | NaN | S |
| 567 | 568 | 0 | 3 | Palsson, Mrs. Nils (Alma Cornelia Berglund) | female | 29.00 | 0 | 4 | 349909 | 21.0750 | NaN | S |
| 831 | 832 | 1 | 2 | Richards, Master. George Sibley | male | 0.83 | 1 | 1 | 29106 | 18.7500 | NaN | S |
Dataset A
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 95 | 96 | 0 | 3 | Shorney, Mr. Charles Joseph | male | NaN | 0 | 0 | 374910 | 8.0500 | NaN | S |
| 720 | 721 | 1 | 2 | Harper, Miss. Annie Jessie "Nina" | female | 6.0 | 0 | 1 | 248727 | 33.0000 | NaN | S |
| 306 | 307 | 1 | 1 | Fleming, Miss. Margaret | female | NaN | 0 | 0 | 17421 | 110.8833 | NaN | C |
| 865 | 866 | 1 | 2 | Bystrom, Mrs. (Karolina) | female | 42.0 | 0 | 0 | 236852 | 13.0000 | NaN | S |
| 66 | 67 | 1 | 2 | Nye, Mrs. (Elizabeth Ramell) | female | 29.0 | 0 | 0 | C.A. 29395 | 10.5000 | F33 | S |
| 408 | 409 | 0 | 3 | Birkeland, Mr. Hans Martin Monsen | male | 21.0 | 0 | 0 | 312992 | 7.7750 | NaN | S |
| 443 | 444 | 1 | 2 | Reynaldo, Ms. Encarnacion | female | 28.0 | 0 | 0 | 230434 | 13.0000 | NaN | S |
| 100 | 101 | 0 | 3 | Petranec, Miss. Matilda | female | 28.0 | 0 | 0 | 349245 | 7.8958 | NaN | S |
| 189 | 190 | 0 | 3 | Turcin, Mr. Stjepan | male | 36.0 | 0 | 0 | 349247 | 7.8958 | NaN | S |
| 203 | 204 | 0 | 3 | Youseff, Mr. Gerious | male | 45.5 | 0 | 0 | 2628 | 7.2250 | NaN | C |
Dataset B
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 430 | 431 | 1 | 1 | Bjornstrom-Steffansson, Mr. Mauritz Hakan | male | 28.0 | 0 | 0 | 110564 | 26.5500 | C52 | S |
| 185 | 186 | 0 | 1 | Rood, Mr. Hugh Roscoe | male | NaN | 0 | 0 | 113767 | 50.0000 | A32 | S |
| 505 | 506 | 0 | 1 | Penasco y Castellana, Mr. Victor de Satode | male | 18.0 | 1 | 0 | PC 17758 | 108.9000 | C65 | C |
| 378 | 379 | 0 | 3 | Betros, Mr. Tannous | male | 20.0 | 0 | 0 | 2648 | 4.0125 | NaN | C |
| 683 | 684 | 0 | 3 | Goodwin, Mr. Charles Edward | male | 14.0 | 5 | 2 | CA 2144 | 46.9000 | NaN | S |
| 339 | 340 | 0 | 1 | Blackwell, Mr. Stephen Weart | male | 45.0 | 0 | 0 | 113784 | 35.5000 | T | S |
| 122 | 123 | 0 | 2 | Nasser, Mr. Nicholas | male | 32.5 | 1 | 0 | 237736 | 30.0708 | NaN | C |
| 816 | 817 | 0 | 3 | Heininen, Miss. Wendla Maria | female | 23.0 | 0 | 0 | STON/O2. 3101290 | 7.9250 | NaN | S |
| 24 | 25 | 0 | 3 | Palsson, Miss. Torborg Danira | female | 8.0 | 3 | 1 | 349909 | 21.0750 | NaN | S |
| 842 | 843 | 1 | 1 | Serepeca, Miss. Augusta | female | 30.0 | 0 | 0 | 113798 | 31.0000 | NaN | C |
Dataset A
Dataset does not contain duplicate rows.
Dataset B
Dataset does not contain duplicate rows.